'Just Google it' - the scope of freely available information sources for doctoral thesis writing


Vincas Grigas, Simona Juzeniene and Jone Velickaite


Introduction. Recent developments in the field of scientific information resource provision 
lead us to the key research question, namely, what is the coverage of freely available
information sources when writing doctoral theses, and whether the academic library can 
assume the leading role as a direct intermediator for information users. 

Method. Citation analysis of doctoral theses was conducted in the summer of 2015. A total of
thirty-nine theses (with 6,998 references) defended at Vilnius University at the end of 2014
were selected (30 per cent of all defended theses). Theses were randomly chosen from
different research fields: the humanities, social sciences, biomedical sciences, technological 
sciences, and physical sciences. 

Analysis. The research team was tasked with identifying whether certain resources could be 
found in the eCatalogue of an academic library, its subscribed databases, freely available 
online (through Google or Google Scholar), or whether the resources from the library's 
subscribed databases are identical to those which are freely available. The data gathering 
process included such resource categories as journal papers, printed and electronic books or 
book chapters, and other documents (legal reports, conference papers, newspaper articles, 
Websites, theses, etc.). 

Conclusions. Library collections and subscribed databases could cover up to 80 per cent of 
all information resources used in doctoral theses. Among the most significant findings to 
emerge from this study is the fact that on average more than half (57 per cent) of all utilised 
information resources were freely available or were accessed without library support. We 
may presume that the library as a direct intermediator for information users is potentially 
important and irreplaceable only in four out of ten attempts of PhD students to seek
information. 

Introduction 

The emergence of Web search engines changed the way scientific
information is searched for (Ortega, 2014). Library-owned search engines,
as well as database search engines, are no longer the first choice for
information users in searching for scholarly literature (Cothran, 2011;
Jamali and Asadi, 2010; Rowlands et al., 2008; Sapa and Krakowska, 2014).
Google has the highest impact on Web searches as it is the most visited
Website globally (Alexa, 2016), whereas Google Scholar indexes the largest
amount of scholarly literature globally (Khabsa and Giles, 2014). Google
and Google Scholar index full texts or metadata of all kinds of scholarly
literature across an array of publishing formats.

Google Scholar is growing in size year by year. As of August 2010, Google
Scholar was estimated to contain 86 million documents (Aguillo, 2012); in
English only, as of January 2013, it contained 99.3 million documents
(Khabsa and Giles, 2014); as of December 2013, 109.3 million documents
(Ortega, 2014); and as of May 2014, 111.15 million documents (Orduna-Malea,
Ayllon, Martin-Martin and Delgado Lopez-Cozar, 2014).

Google and Google Scholar aid in finding the two most often used types of
scholarly literature, namely peer-reviewed journals and so-called grey
literature. Grey literature covers documents which are not formally
published by academic publishers, but can be important in systematic and
evidence-based reviews. It includes various kinds of reports, working
papers, white papers, evaluations, government documents, theses,
conference proceedings, pre-prints, post-prints, newsletters, and laboratory
research books. A full list of document types featured in grey literature is
offered on the GreyNet International Website (GreyNet International,
2016). In most cases, grey literature that is on the Web is freely
available. Grey literature plays an important role in the
communication of scholarly information as it is available and accessible on
a large scale owing to widespread scholarly social networks and
institutional repositories.

The Web of Science database is limited in its ability to represent the full
extent of grey literature because of its restricted scope; Google
Scholar and Google therefore evidence the use of more up-to-date information
available on the Web to a larger extent than citations in Web of Science
reveal (Hutton, 2000, p. til). It has been found that Google Scholar
results contain moderate amounts of grey literature, with the majority of
it appearing at around page eighty of the results on average. It has also
been ascertained that, when searched for specifically, most of the
literature identified using Web of Science could also be found using Google
Scholar (Haddaway, Collins, Coughlin and Kirk, 2015).

There is debate about the quality of the pre-print versions
which are freely available on the Web (Klein, Broadwell, Farb and
Grappone, 2016). A comparison of published scientific journal papers
with their pre-print versions revealed that generally there were few changes
in the content of the published papers as compared to their pre-print
versions.

An important aspect of Google Scholar is that almost 50 per cent of its
content is available off-campus (Khabsa and Giles, 2014). Another study
revealed that out of sixty-four thousand highly cited documents in Google
Scholar approximately 40 per cent can be accessed freely (Martin-Martin,
Orduna-Malea, Ayllon and Delgado Lopez-Cozar, 2014). A recent study
published in 2015 suggests that 61.1 per cent of full-text scientific papers
found with Google Scholar were freely available off-campus (Laakso and
Lindman, 2015). The latest research revealed that approximately 60 per cent
of all published scientific papers have an open-access copy available.

The number of journals offering unrestricted access to their content has also
been growing. For instance, analysis of delayed open access journal papers
showed that 77.8 per cent of these papers became open access within
twelve months of publication, with 85.4 per cent becoming available
within twenty-four months (Laakso and Bjork, 2012). Bjork, Laakso,
Welling and Paetau (2014) established that a synthesis of previous studies
indicated that green open access coverage of all published journal papers
was approximately 12 per cent, with substantial disciplinary variation.
Another study in this field, carried out by White (2014), suggests that
approximately 30 per cent of papers are freely accessible in their year of
publication, rising to nearly 40 per cent in the following years, and that
repositories are responsible for approximately 50 per cent of freely available
papers. As of April 2014, more than 50 per cent of the scientific papers
published in 2007, 2008, 2009, 2010, 2011, and 2012 could be downloaded
for free on the Web (Archambault et al., 2014). The latest research revealed
that approximately 60 per cent of all published papers have an open access
copy available (Laakso and Lindman, 2016). This may be a positive result of
the increasing interest in promoting open access to scientific publications
and research data, resulting in growing free-of-charge access to
scientific publications for any user (Archambault et al., 2014; European
Commission, 2016).

The increasing use of Web-based search engines and widespread freely 
available full-text literature on the Web are indications of increased 
possibilities to have access to scholarly literature without using library 
subscribed databases or library local collections. Recent developments in 
the provision of scientific information resources lead us to the key research 
question, namely what is the coverage of the freely available information 
sources when writing doctoral theses, and whether the academic library can 
assume the leading role as a direct intermediator for information users. 

Specifically in this exploratory pilot study we seek to address the following 
questions: 

RQ1. How important are library information resources in compiling
material for doctoral theses? 

RQ2. What types of information sources are most often used when writing 
doctoral theses? 

RQ3. What are the potential ways of getting certain information resources
cited in doctoral theses? 

Literature review 

Many scholars hold the view that libraries act as an intermediator between
information resources and information users and the key role of the library
is to serve the users’ needs (Brophy, 2000; Mieziniene and Prokopcik,
2000; Wilson, 1998). As suggested by the Generic Library Model (Brophy,
2000) (the model is reproduced in Figure 1), the library is viewed as an
intermediator between the user and information resources which are
potentially available to that user, as expressed through Information Use and
Access Processes (centre green rimmed box). The intermediation may be
defined as a process where the library enables particular users (User
Population) through a User Interface to gain access to the required
information (Information Population) through a Source Interface.

Figure 1: Generic library model (Source: Brophy, 2000).

This paper offers a critical examination of the assumption that the library 
may serve as an intermediator for information users in the process of 
information use and access. According to Brophy (2000), a much debated 
question is whether the library and information services, in any 
recognisable form, will be in demand in the new millennium and whether 
access to information resources will remain among the reasons for visiting 
libraries, as the preceding studies suggest that information and 
communication technologies have a significant impact on the library and 
the information sector. 

With the role of the academic library defined, let us move on to the 
discussion of whether changes in the access to freely accessible full-text 
information resources may challenge the role of the library as an 
intermediator, because use of information has undergone certain changes. 

Van Noorden (2014) analysed more than 3,500 survey responses from ninety-five
different countries. His research suggests that more than 60 per cent of
researchers representing science and engineering, and more than 70 per
cent of those representing social sciences and the humanities, are aware of
and regularly visit Google Scholar, and more than 90 per cent of them have
knowledge of Google Scholar. Another study, by Boyum and
Aabo (2015), concludes that Google Scholar is perceived as a highly
convenient instrument and has been extensively used among business PhD
students in Norway. Google and Google Scholar help people find
information across the Internet and are free of charge. As a result, the latter
is widely used when searching for scholarly information, whereas Google is
often seen as a starting point, and it is quite usual for PhD students to end
their search on Google as well (Connaway, White, Lanclos and Le Cornu,
2012; Jamali and Asadi, 2010). Research suggests that science and
technology students are more likely to use Google Scholar than their peers
representing the humanities and social sciences (Wu and Chen, 2014).
Moreover, it has been established that the scientific community has been
widely using social media to obtain scientific papers directly from colleagues
(Kjellberg, Haider and Sundin, 2016; Laakso and Lindman, 2016).

A study conducted by Vezzosi (2000) suggests that PhD students rely
heavily on the Web for their research work, whereas their use of the library
is limited to document delivery and interlibrary loan. As regards PhD
students’ information practice, Carpenter (2012) identified that it is
typical of them to be satisfied with the abstract when they cannot get the
full-text scientific paper. Interestingly, Gullbekk, Rullestad and Carme
Torras Calvo (2013) observed that PhD students indicate easy access to
full-text scientific papers as the most important aspect when choosing
information resources, are keen on using freely available electronic
full-text information sources (they use Google a lot), and have reduced
their use of printed information sources. Another interesting finding
suggests that PhD students cite conference proceedings and journal papers
more often than the faculty does (Lariviere, Sugimoto and Bergeron, 2012);
for example, PhD students are apt to cite a large variety of formats,
including conference papers, technical reports, and government documents
(Condic, 2015). These types of documents are more easily accessible using
Google Scholar than subscribed databases, for instance, Web of Science
(Khabsa and Giles, 2014).

Taken together, the above results support the idea that the printed
collections and subscribed databases of the academic library are gradually
decreasing in their importance for information users because more and
more full-text information resources may be found on the Web using
generic search engines such as Google and Google Scholar. They also support
the idea that library collections and subscribed databases could potentially
be replaced by freely available full-text information sources accessed
through generic search engines.

For decades, collection development and management was among the key
roles of the academic library. Today, however, the situation is far from
stable, as the infosphere is undergoing rapid changes, thus reshaping our
traditional ways of information behaviour (Floridi, 2014). This point of view
is supported by Delaney and Bates (2015), who write that increased
competition from other information providers, such as Google and Amazon,
decline in the use of the Online Public Access Catalogue, and changes in user
activities and in people's engagement and interaction with the library and
its resources are but a few potential challenges to the academic library.

Librarians have started looking for new ways to act. Petraityte (2013)
showed that the academic library is actively searching for
its place in the chain of scientific communication and information, and its
future scenarios are being discussed. In her study, Petraityte (2014)
highlights that a number of authors dwell on the significance of
strategic partnership and cooperation; describes the role of the academic
library as a proactive disseminator of innovation within the parent
institution; positions the academic library as the leader in the use and
application of information technologies at a university; and discusses the
functions of publishing, scientific data curation and dissemination assumed
by the academic library.

These roles of the academic library are rather new and not all members of
university staff accept them. Petraityte (2013) points out that the traditional
point of view on the library’s role as an information source provider is still
viable. Recent developments in scientific communication have heightened
the need for research which would disclose whether researchers have the
option of obtaining the necessary information sources without using library
collections or library subscribed databases. There are no published data on
how many of the freely accessible full-text information sources PhD
students could potentially use in their main written assignment without
availing themselves of their university library services.

With a view to answering the above-posed question, the authors resolved to 
implement an exploratory study of information resources utilised when 
writing doctoral theses. The authors of this pilot study have opted for 
citation analysis to discover how PhD students could successfully write their 
theses without using any library services. Citation analysis is a well- 
established approach in social sciences. 

In recent years, two different approaches have been employed for citation
analysis: a) to measure the use of library collections (Enger, 2000;
Feyereisen and Spoiden, 2000; Kumar and Dora, 2011; Tonta and Al,
2006); and b) to assess citation habits (Echezona, Okafor and Ukwoma,
2011; Emerson, 2015; Kaczor, 2014; Keogh, 2012; Kuruppu and Moore,
2008; Sudhier and Kumar, 2010). Our idea was to use citation analysis to
evaluate how useful freely available full-text information sources can be
when writing PhD theses. This measurement could help us determine to
what extent the academic library may be important to PhD students as an
information resource provider, and to collect further evidence on how
strong the role of the academic library as an intermediator could be.

Method 

With a view to addressing the research questions, citation analysis of thirty- 
nine doctoral theses (30 per cent of all theses defended at Vilnius University 
at the end of 2014) was conducted in the summer of 2015. These theses were 
randomly selected from different fields and branches. 

Social sciences. Twelve out of the thirty-nine defended theses were 
selected for the research representing seven different branches: 
management (two theses), political science (two), communication and 
information (one), law (one), economics (two), sociology (two), and 
psychology (two). 

Biomedical sciences. Ten out of the thirty-five defended theses were 
selected for the research, representing four different branches: biophysics 
(two), botany (one), medicine (five), and biology (two). 

Technological sciences. Two out of the six defended theses were selected
for the research, representing one branch: computer engineering (two).

Physical sciences. Ten out of the thirty-four defended theses were
selected for the research, representing six different branches: mathematics
(one), chemistry (two), biochemistry (two), physical geography (two),
informatics (one), and physics (two).

The humanities. Five out of the sixteen defended theses were selected for 
the research, representing two different branches: philosophy (two), and 
philology (three). 
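
The sampling figures listed above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration (the variable and function names are ours, not the authors'); it takes the defended and selected counts per field as stated in the text and recomputes the totals and the sampled share.

```python
# Theses per field as listed above: (defended, selected for the study).
SAMPLE = {
    "social sciences": (39, 12),
    "biomedical sciences": (35, 10),
    "technological sciences": (6, 2),
    "physical sciences": (34, 10),
    "humanities": (16, 5),
}


def sampling_summary(fields):
    """Return total defended, total selected and the selected share (per cent)."""
    defended = sum(d for d, _ in fields.values())
    selected = sum(s for _, s in fields.values())
    return defended, selected, 100 * selected / defended


defended, selected, share = sampling_summary(SAMPLE)
print(f"{selected} of {defended} theses selected ({share:.0f} per cent)")
for field, (d, s) in SAMPLE.items():
    # Per-field share, to see how evenly the fields were sampled.
    print(f"{field}: {s}/{d} = {100 * s / d:.1f} per cent")
```

The totals agree with the thirty-nine theses and the 30 per cent share reported in the text.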

A total of 6,998 bibliographical references were collected. Thesis reference
lists were used to identify the cited resources. Every item from the lists was
subjected to dual analysis, on-campus and off-campus, to establish the
quantity of utilised freely available resources. Twelve criteria, which fall
into two groups, were employed to analyse each item on the lists.

Part one. The research team tried to identify whether certain resources can
be found in the library’s eCatalogue (i.e. whether the library is in possession 
of those particular resources), in the library’s subscribed databases (for 
example, journal papers, e-books), or whether these resources in full text 
can freely be found online off-campus by merely using Google or Google 
Scholar. 

Part two. The data gathering process also included identifying resource 
categories such as peer-reviewed papers, printed and electronic books or 
book chapters, reports and studies, conference papers, newspapers and 
other not peer-reviewed papers, Websites, theses (postgraduate degree 
theses, including master’s and doctorates), and other (any search record 
that could not be categorised according to the above classification). 

Descriptive statistical analysis for qualitative variables was employed 
(percentage was calculated). The calculation procedure was as follows: 

• types of used information sources; 

• use of peer-reviewed papers; 

• use of e-books and books; 

• potential ways of getting information sources; 

• freely available information sources identical to those found in 
subscribed databases. 

It should be noted that the Potential ways of accessing information sources
section of the Results contains duplicated data, as identical information
sources were available in library eCatalogues, subscribed databases and on the
Web. This means that one and the same information source could at the
same time potentially appear in all three categories.
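
A minimal sketch may help show why such overlapping categories produce percentages that add up to more than 100. The records, field names and the "not found" category below are hypothetical and serve only as an illustration; they are not the study's data or coding scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """One cited item with hypothetical availability flags."""
    title: str
    in_catalogue: bool
    in_databases: bool
    free_online: bool


def channel_shares(refs):
    """Share of references (per cent) reachable through each channel.

    A reference is counted once per channel it appears in, so the shares
    may sum to more than 100 when channels overlap.
    """
    n = len(refs)

    def share(predicate):
        return 100 * sum(1 for r in refs if predicate(r)) / n

    return {
        "eCatalogue": share(lambda r: r.in_catalogue),
        "subscribed databases": share(lambda r: r.in_databases),
        "freely available": share(lambda r: r.free_online),
        "not found": share(
            lambda r: not (r.in_catalogue or r.in_databases or r.free_online)
        ),
    }


# Hypothetical example: item B is in a subscribed database and free online.
refs = [
    Reference("A", True, False, False),
    Reference("B", False, True, True),
    Reference("C", False, False, True),
    Reference("D", False, False, False),
]
shares = channel_shares(refs)
print(shares, "total:", sum(shares.values()))
```

In this toy example the four shares sum to 125 per cent because item B is counted in two channels.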

The statistical analysis was performed using SPSS software (version 21). 

The Shapiro-Wilk statistical test was employed to evaluate the normality of 
data, whereas the differences in the means of the independent groups were 
analysed applying the Kruskal-Wallis H test and the one-way ANOVA
method. 
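
The authors ran these tests in SPSS. For readers unfamiliar with the Kruskal-Wallis H test, the following pure-Python sketch shows how the H statistic is computed from ranks. It is an illustration only: the per-thesis values are hypothetical, the function names are ours, and no tie correction is applied.

```python
from itertools import chain


def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (without tie correction)."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical shares of peer-reviewed papers (per cent) in three theses
# from each of three fields; NOT the study's data.
humanities = [10, 15, 20]
social = [35, 40, 45]
biomedical = [85, 90, 95]
h = kruskal_wallis_h([humanities, social, biomedical])
print(f"H = {h:.2f}")
```

With three groups the statistic is compared against a chi-square distribution with two degrees of freedom, whose critical value at the 0.05 level is 5.991. For samples as small as this toy example the chi-square approximation is rough, so the comparison is indicative only.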

Results and discussion 

Note. Values that are not whole numbers are reported as decimals. For the
sake of a consistent description of the research results, all numbers are
given in decimal form.

Types of information sources used. 

The Shapiro-Wilk test was employed to assess the normality of data. Data 
on the types of information were not normally distributed among all types of 
information sources - significance value of the Shapiro-Wilk Test was 
lower than 0.05. A nonparametric test, the Kruskal-Wallis H test, was used 
to find out if there were statistically significant differences between the 
types of information sources. The Kruskal-Wallis H test revealed significant 
differences between fields and utilised information sources. This supports the
findings of Fry, Spezi, Probets and Creaser (2015) and Jamali and Nicholas
(2008) that different disciplines may potentially exhibit different
information users’ behaviour.

The Mean Rank numbers resemble the results as provided in the percentage 
format, therefore the latter form will be used for a more convenient 
reflection of results. Significant differences in information sources 
(significance value lower than 0.05) are observed in peer-reviewed papers, 
e-books, printed books, reports, studies, and conference papers. 

As indicated in Table 1 the most popular types of information resources are 
scholarly electronic journals - 49.81 per cent (a very small number of all 
used journals were printed ones), books - 26.46 per cent, and other not 
peer-reviewed periodicals - 6.05 per cent. Less popular source types are as 
follows: e-books - 4.61 per cent, conference papers - 3.99 per cent, 
Websites - 1.88 per cent, reports or studies - 1.42 per cent, and theses - 
1.09 per cent. The term Other covers sources which include legal 
documents, maps, companies’ reports, blogs, etc. which make up 4.69 per 
cent of all information used in all theses. These numbers correlate with 
previous research findings (Lariviere et al., 2013) where it was established 
that other theses were also the least cited information resource in the 
process of thesis writing. Moreover, there is a significant difference between 
electronic and printed resources which suggests that the electronic format is 
more common in most research fields. 


| Scientific fields | Peer-reviewed papers | e-books | Printed books | Reports, studies | Conference papers | Newspapers and other papers | Websites | Theses | Other |
|---|---|---|---|---|---|---|---|---|---|
| Social sciences | 42.68 | 3.15 | 28.47 | 3.79 | 1.97 | 9.14 | 2.58 | 1.02 | 7.20 |
| Humanities | 14.77 | 8.51 | 66.93 | 0.16 | 0.96 | 4.17 | 0.32 | 1.61 | 2.57 |
| Technological sciences | 33.47 | 9.92 | 17.77 | 2.07 | 14.05 | 9.09 | 3.31 | 1.65 | 8.68 |
| Physical sciences | 69.89 | 0.53 | 15.45 | 0.85 | 2.70 | 4.60 | 2.50 | 0.92 | 2.56 |
| Biomedical sciences | 88.22 | 0.96 | 3.69 | 0.23 | 0.27 | 3.23 | 0.68 | 0.27 | 2.46 |
| Arithmetic mean | 49.81 | 4.61 | 26.46 | 1.42 | 3.99 | 6.05 | 1.88 | 1.09 | 4.69 |

Table 1: Distribution by information source types (percentage)
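
The Arithmetic mean row of Table 1 can be reproduced from the five field rows. The sketch below (the names are ours, chosen for illustration) recomputes it as a simple unweighted mean across fields, which suggests that the overall percentages quoted in the text give each field equal weight rather than weighting by the number of references per field.

```python
# Percentages per field, copied from Table 1 in column order.
COLUMNS = [
    "peer-reviewed papers", "e-books", "printed books", "reports, studies",
    "conference papers", "newspapers and other papers", "websites",
    "theses", "other",
]
TABLE_1 = {
    "social sciences": [42.68, 3.15, 28.47, 3.79, 1.97, 9.14, 2.58, 1.02, 7.20],
    "humanities": [14.77, 8.51, 66.93, 0.16, 0.96, 4.17, 0.32, 1.61, 2.57],
    "technological sciences": [33.47, 9.92, 17.77, 2.07, 14.05, 9.09, 3.31, 1.65, 8.68],
    "physical sciences": [69.89, 0.53, 15.45, 0.85, 2.70, 4.60, 2.50, 0.92, 2.56],
    "biomedical sciences": [88.22, 0.96, 3.69, 0.23, 0.27, 3.23, 0.68, 0.27, 2.46],
}
REPORTED_MEAN = [49.81, 4.61, 26.46, 1.42, 3.99, 6.05, 1.88, 1.09, 4.69]


def column_means(rows):
    """Unweighted mean of each column across fields, rounded to two decimals."""
    n = len(rows)
    return [round(sum(col) / n, 2) for col in zip(*rows.values())]


for name, computed, reported in zip(COLUMNS, column_means(TABLE_1), REPORTED_MEAN):
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```

Each field row also sums to 100 per cent, up to rounding.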

Use of peer-reviewed papers 

As indicated in Figure 2, our research helped establish that peer-reviewed
papers were the most often used type of information, which correlates with
findings presented in other studies (Carpenter, 2012; Niu, Hemminger,
Brown, Powers and Tennant, 2010; Rowlands et al., 2008; Tenopir et al.,
2010; Tenopir, King, Christian and Volentine, 2015; Tenopir, King,
Edwards and Wu, 2000). PhD students of biomedical and physical sciences
are substantial users of peer-reviewed journals: 88.22 and 69.89 per cent
respectively of all information utilised in their theses was derived from
electronic journals.































This is in line with the work of Nicholas, Clark, Rowlands and Jamali (2000),
which suggests that researchers of life sciences are among the most frequent
users of journal papers. The percentage of electronic journals utilised by
students representing the field of social sciences makes up 42.68 per cent.
They are followed by representatives of technological sciences with 33.47
per cent and the humanities with 14.77 per cent.

The lower percentage of electronic journals suggests that students of
technological sciences rely on books more than those of biomedical or
physical sciences. The multiplicity of resource types used in this field
may account for this. The available data suggest that PhD
students representing technological sciences gather necessary information
from conference papers (14.05%), Websites (3.31%), and theses
(1.65%) more actively than those in other fields, which
suggests that students of technological sciences are likely to make use of
more diverse resources than their peers.

These numbers are not surprising, as they are consistent with
earlier and current research from around the world. King and
Montgomery (2002), of Drexel University, Philadelphia,
conducted a study aimed at finding out who read more electronic
journals - the faculty community or PhD students. The gathered data
helped establish that average reading per person and average time spent on
reading publications was almost 20 per cent higher among PhD students.
Moreover, PhD students not only read more electronic journals, but also
cited them more in their papers than faculty members (Lariviere et al.,
2013). A recent study corroborated these statistics, which
show a steady increase in the numbers. It found that
the most active electronic journal readers are PhD students from all
disciplines (exclusive of the humanities) (Mohammadi, Thelwall, Haustein
and Lariviere, 2015). They read almost four times more than
representatives of any other academic group.

Figure 2: Use of peer-reviewed papers (percentage)

Use of books and e-books. 

Figure 3 exhibits the findings of our research, which indicate that books and
e-books are highly prevalent among PhD students of the humanities, while
for other disciplines the percentages are lower. These results correlate with
those of the research implemented by Brown and Swan (2007). Among the
limitations of books as a type of information source is their format, as users
tend to show considerable preference for electronic full-text offerings
(Brown and Swan, 2007; Tenopir et al., 2010, 2013).

Figure 3: Use of books and e-books (percentage). Bar chart comparing books
and e-books for social sciences, humanities, technological sciences,
physical sciences, biomedical sciences and the arithmetic mean.

It should be noted, however, that the use of e-books is still relatively
insignificant. We can presume that the key problem resulting in the meagre
use of e-books is related to the insufficient quantity of high quality texts
published in this format. Several issues pertaining to the publishing of
academic e-books were highlighted by the librarians of Alabama University
(Walters, 2012). They were tasked with drawing up the mandatory literature
list for medical students consisting exclusively of e-books. That proved
impossible because most of the required books were not issued in any
electronic format. In the paper, Walters (2012) also highlights other
problems related to the supply of and demand for academic e-books: there
is an apparent shortage of high quality academic literature from the
publishers’ side and most e-books are published from three to eight months
later than their paper versions, which is too long, given the rate of issue and
quantities of new scientific production in some of the research fields. Hence,
we can assume that this is the reason why the role of e-books in thesis
writing is so insignificant in such fields as physical (0.53%) and biomedical
(0.96%) sciences. Today, the most active users of e-books are PhD students
of technological sciences (9.92%). They also use printed books
more frequently than students of physical sciences and over four times
as frequently as PhD students of biomedical sciences. In the field of
technological sciences the gap between the utilisation of printed books and
e-books is also the smallest: printed books are used less than twice as
often as e-books. By comparison, in physical sciences printed books are
used about twenty-nine times as often as e-books. The gathered data suggest
that books are actively used in technological sciences, which is regarded as
one of the fastest evolving fields; however, finding out the precise reasons
why e-books in particular are used so much in this field rather than in
other fields demands further analysis.
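
The ratios quoted in this section follow directly from the printed book and e-book columns of Table 1. The short sketch below (with names chosen by us for illustration) recomputes them.

```python
# (printed books, e-books) in per cent, taken from Table 1.
BOOK_USE = {
    "social sciences": (28.47, 3.15),
    "humanities": (66.93, 8.51),
    "technological sciences": (17.77, 9.92),
    "physical sciences": (15.45, 0.53),
    "biomedical sciences": (3.69, 0.96),
}


def printed_to_ebook_ratio(field):
    """How many times more often printed books are cited than e-books."""
    printed, ebooks = BOOK_USE[field]
    return printed / ebooks


for field in BOOK_USE:
    print(f"{field}: printed/e-book ratio = {printed_to_ebook_ratio(field):.2f}")

# Printed book use in technological sciences relative to other fields.
tech_printed = BOOK_USE["technological sciences"][0]
print(f"vs physical sciences: {tech_printed / BOOK_USE['physical sciences'][0]:.2f}")
print(f"vs biomedical sciences: {tech_printed / BOOK_USE['biomedical sciences'][0]:.2f}")
```

The printed-to-e-book ratio is smallest in technological sciences (just under two) and largest in physical sciences (about twenty-nine), which matches the figures given in the text.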

Potential ways of accessing information sources. 


Three potential ways of accessing information resources have been
subjected to analysis: library eCatalogues, subscribed databases, and freely
available full-text information resources. The assessment of the normality of
data was based on the Shapiro-Wilk test. The test revealed that the groups
of library eCatalogue (Sig. 0.000) and unknown sources (Sig. 0.001) were
not normally distributed (significance value lower than 0.05), whereas
subscribed databases (Sig. 0.318) and freely available sources (Sig. 0.540)
were normally distributed. For further analysis of the results both
non-parametric and parametric methods were employed: the Kruskal-Wallis H
test and the one-way ANOVA test. The authors’ decision to resort to the
parametric test, namely the one-way ANOVA test, for further analysis was
determined by the fact that the results of the analysis based on this test
were more consistent. The parametric test was employed to determine whether
there were any statistically significant differences in the potential ways
of getting access to information sources.

Statistically significant differences in the potential ways of
accessing information sources were found for the eCatalogue (Sig. 0.000),
subscribed databases (Sig. 0.000) and freely available sources (Sig. 0.001),
but not for unknown sources (Sig. 0.077). These results suggest a significant
difference between the use of printed and electronic resources. With a view to
analysing the disciplines in which the differences were most prominent, a
post hoc comparison was conducted by means of the Tukey HSD test. The
test indicated that the differences among social, technological, physical,
and biomedical sciences were insignificant, but that each of these fields
differed significantly from the humanities. PhD students representing the
humanities are therefore among the most active users of the library’s
printed sources. Analysis of electronic resources, however, revealed the
opposite: students of all fields were actively using electronic resources
provided by the library with insignificant differences, with the exception
of those representing the humanities, who were the least active group.

Figure 4: Potential ways of accessing information sources (percentage)


Over 14 per cent of information sources used by PhD students could 
potentially be found in the library eCatalogue. As indicated in Figure 4, PhD 
students representing the humanities were able to find the biggest share of 
needed information in the library eCatalogue (41.16%), as compared to 
students of other research fields - technological (6.81%), physical (6.51%) 
and biomedical sciences (3.05%). This correlates with the type of most 
popular information resources in each field - representatives of the 
humanities mostly use printed books and are therefore more likely to find 
what they need in the library stacks by searching the eCatalogue, whereas 
PhD students of biomedical sciences tend to find more relevant information 
in electronic format, so the share of printed books and eCatalogue use in 
that field is notably smaller. Information in the physical and 
technological sciences is updated constantly, which makes it a challenge 
for libraries to keep up. 

Over 27 per cent of information sources used by PhD students (mostly peer- 
reviewed papers) could potentially be found through Vilnius University 
Library subscribed databases. PhD students of biomedical sciences could 
potentially find almost half (43.16%) of the information they needed in 
subscribed databases; the situation was similar in the physical sciences 
(40.61%). In contrast, for the humanities, the main source of information 
was printed books (41.16%) searched for through the eCatalogue. 

Over 35 per cent of information sources used by PhD students could 
potentially be found free off-campus. Free off-campus access was potentially 
available to almost 57 per cent of all information sources used by PhD 
students of technological sciences and more than 40 per cent of those 
utilised by PhD students of biomedical sciences. In the latter case it almost 
equals the share of information sources which they could potentially access 
using subscribed databases. 

On average more than half (57%) of all utilised information resources were 
freely available or could be accessed without using the collections of the 
home library (the latter being sources whose potential way of access is 
unknown). When Websites and articles from newspapers or blogs are excluded, 
this amounts to approximately 40 per cent of the higher quality information 
resources used. 

Our research revealed that PhD students of the humanities are the most 
frequent users of books; subscribed databases are therefore not of high 
importance to them, and freely available information sources relevant for 
them are rather sparse, making up only 18 per cent of all information 
sources used. There are far fewer freely available scholarly books than 
peer-reviewed papers. As Montgomery (2013) pointed out, the open access 
debate has paid little attention to the role that open access might play in 
helping a deeply inefficient system of publishing scholarly books. The 
Directory of Open Access Books initiative is taking its first difficult 
steps, and numerous unsolved problems still exist (Whitford, 2014). 

A mere 25 per cent of all necessary sources for PhD students of physical 
sciences could potentially be found freely available. A closer inspection of 
the electronic journals utilised by PhD students of physical sciences in the 
selected theses revealed that most of the journals had high impact factors or 
were in the top 25 per cent of journal titles in a given subject listed in 
Thomson Reuters’ Journal Citation Reports. Most of these journals and 
their articles can be found in highly protected and expensive databases, 
therefore they are not as easily available as free open access resources. 


Percentage of freely available information sources 
identical to those found in subscribed databases. 




The smallest overlap between resources provided by Vilnius University 
Library and freely available resources was detected in the field of 
physical sciences. As already mentioned, the reason is that PhD students of 
physical sciences were keener on using papers from journals with high 
impact factors or from the top 25 per cent of the journal titles in a given 
subject listed in Thomson Reuters’ JCR. 
Figure 5: Percentage of freely available information sources identical to those 
found in subscribed databases 

The situation in the field of social sciences is the opposite: data featured in 
Figure 5 suggest that the overlapping of information resources is 69 per 
cent. If we assume that doctoral students start their search using the Google 
search engine, there is a 69 per cent chance that they will find what they 
need without turning their attention to a library eCatalogue or subscribed 
databases. A few earlier studies conducted in Lithuania before 2010 
indicate that almost 85 per cent of doctoral students from various 
scientific fields use Google to find full-text papers (Tautkeviciene, 
Duobiniene, Kretaviciene, Kriviene and Petrauskiene, 2010). Interestingly, 
Catalano (2013) discovered that more doctoral students of social sciences 
than any other study programme make use of library resources. 

A notable finding suggested by these data is that 58 per cent and 50 per 
cent of freely available sources in the fields of technological and 
biomedical sciences, respectively, have content identical to that of the 
subscribed databases. Overall, the research results revealed that the 
overlap between Vilnius University Library resources and freely available 
resources is as high as 47 per cent. 
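
An overlap rate of this kind could be computed along the following lines. 
This is a minimal sketch under stated assumptions: each reference is taken 
to be coded with two yes/no flags, the class and function names are 
hypothetical, the records are invented, and because the text does not spell 
out which set serves as the denominator, the function lets the caller 
choose it. 

```python
from dataclasses import dataclass


@dataclass
class Reference:
    in_databases: bool  # found in the library's subscribed databases
    free_online: bool   # found freely available online


def overlap_rate(refs, base="free_online"):
    """Percentage of references in the `base` set that carry both flags.

    `base` names the flag defining the denominator: "free_online" or
    "in_databases". Returns 0.0 when the base set is empty.
    """
    base_set = [r for r in refs if getattr(r, base)]
    if not base_set:
        return 0.0
    both = [r for r in base_set if r.in_databases and r.free_online]
    return 100 * len(both) / len(base_set)


# Invented example records, not data from the study.
refs = [
    Reference(in_databases=True, free_online=True),
    Reference(in_databases=True, free_online=False),
    Reference(in_databases=False, free_online=True),
    Reference(in_databases=True, free_online=True),
    Reference(in_databases=False, free_online=True),
]

print(overlap_rate(refs, base="free_online"))
print(overlap_rate(refs, base="in_databases"))
```

The two calls generally give different percentages, which is why the 
choice of denominator matters when overlap figures are compared across 
fields. 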

However, as indicated in Figure 4, a considerable number of referenced 
resources remained unknown - 22 per cent on average. The types of resources 
falling under the term unknown were analysed separately for each subject 
field. In none of the subject fields did the enquiry disclose a clear way 
of potential access to these peer-reviewed papers and printed books, thus 
suggesting that these resources were not available in full text in the 
university library eCatalogue, in subscribed databases or freely online. 
Hence we can assume 
that students had to use other libraries, purchase or use social contacts to 
obtain needed information. In-depth analysis of the unknown resources in 
each field revealed that they were mostly of the same type as the most used 
resources in that field. For example, in biomedical sciences peer-reviewed 
papers were the most popular type of information (88%) and also made up the 
largest share of the unknown category (79%). A similar situation is 
observable in physical sciences - peer-reviewed papers were the most 
frequently used (67%) and made up most (44%) of the resources in the 
unknown category. The social sciences and the humanities manifest an 
analogous situation with printed books. The only exception is technological 
sciences, where students used more peer-reviewed papers than printed books, 
yet printed books outnumbered papers in the category where the way of 
potential access is unknown: 52 per cent of all the unknown resources were 
printed books and less than 18 per cent were peer-reviewed papers. Reports 
and studies made up the smallest percentage of unknown resources, ranging 
from 0 to 1.7 per cent throughout all the subject fields. 

The authors assume that the reasons for this could be as follows: PhD 
students use information resources during their internships or use 
resources provided by libraries in other countries; they use interlibrary 
loan services offered by libraries; at times paper abstracts are sufficient 
to review the examined field and full-text access is optional. The research 
data suggest that, by combining freely available and unknown sources 
(Figure 6), PhD students of all fields could access more than half of the 
necessary information without using the collections of the home library. In 
the case of technological sciences the percentage is as high as 75 per cent. 
Figure 6: Percentage of freely available information sources and sources whose 
way of accessing is unknown (no free access and no access through library 
subscribed databases or eCatalogues) 

Conclusions, limitations and further research 

This exploratory study was aimed at answering research questions 
regarding the significance of freely accessible information resources to PhD 
students when writing their doctoral theses. Moreover, the authors sought 
to interpret the results in order to determine whether academic 
libraries and their collections are relevant for PhD students when writing 
doctoral theses and whether the academic library can assume the leading 
role as a direct intermediator for information users. 

Readdressing the key research question posed at the beginning of this study, 
it is now possible to state that library collections and subscribed databases 
could potentially cover up to 41 per cent of all information resources used in 
doctoral theses. One of the more significant findings to emerge from this 
study is that on average more than half (57%) of all utilised information 
resources were freely available or could be accessed without using the 
collections of the home library. Approximately 40 per cent of the higher 
quality information resources (excluding Websites and articles from 
newspapers or blogs) relied on when writing theses could be potentially 
accessed without using the collections of the home library. We may presume 
that the library as a direct intermediator for information users is potentially 
important and is irreplaceable only in four out of ten attempts when PhD 
students seek information. 
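
The two headline shares follow from simple addition of the rounded averages 
reported in the results. The sketch below only reproduces that arithmetic; 
the dictionary keys are labels chosen here for illustration, and the values 
are the rounded figures quoted in the text ("over 14", "over 27", "over 35" 
and 22 per cent), so they are approximations rather than exact study data. 

```python
# Rounded average shares (per cent) of the potential ways of access,
# as quoted in the text; the first three are lower bounds ("over ...").
shares = {
    "eCatalogue": 14,
    "subscribed databases": 27,
    "freely available": 35,
    "unknown": 22,
}

# Sources the home library could supply.
library = shares["eCatalogue"] + shares["subscribed databases"]

# Sources reachable without the home library's collections.
non_library = shares["freely available"] + shares["unknown"]

# What is left over is due to rounding the reported averages down.
rounding_gap = 100 - library - non_library

print(library, non_library, rounding_gap)
```

The library share of roughly 41 per cent is what underlies the statement 
that the library is irreplaceable in about four out of ten attempts. 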

Below are the findings with regard to each of the questions: 

RQ1. How important are library information resources in compiling 
material for doctoral theses? On average the printed collections of Vilnius 
University Library can potentially meet not more than one sixth of the PhD 
students’ information needs. Only PhD students representing the 
humanities find printed library collections an important source of 
information (41% of all used information sources). The research strongly 
suggests that printed library collections are the least important information 
source for PhD students. It should be noted, however, that these results are 
very local and depend heavily on that particular library, therefore further 
analysis would be helpful in finding out how Google Books, open access 
books and other freely available sources overlap with in-house collections 
and with information utilised when writing doctoral theses. 

We can conclude that electronic journals are the most popular information 
source in most research fields. However, on average electronic journals from 
subscribed databases meet only one quarter of the total of PhD students’ 
information needs. This share also depends strongly on the number of 
subscribed databases in that particular library. On the other hand, not all 
content of subscribed databases is irreplaceable, as part of it overlaps 
with freely accessible information sources (we discovered that on average 
the overlap reaches 47 per cent). With this in mind, it is important to 
establish the quantity of 
information sources provided by subscribed databases that could be 
accessed freely off-campus to get a broader picture of how vital library 
information resources are for doctoral students. 

RQ2. What types of information sources are most often used when writing 
doctoral theses? Research results suggest the importance of peer-reviewed 
papers (almost 50% on average). Usage of books (e-books and printed 
books) on average made up 30 per cent of all information resources. The 
remaining almost 20 per cent of the used information resources typically 
are not collected by the library and in most cases are freely available. They 
include Websites, conference proceedings, newspaper and blog articles, 
maps, statistical data from statistical information departments, etc. The 
above-listed results indicate that potentially the library, as an information 
source provider, could meet almost 80 per cent of all information needs, if it 
had sufficient funds to procure all necessary books and peer-reviewed 
journals which require subscription. 


RQ3. What are the potential ways of getting certain information resources 
cited in doctoral theses? Analysis of the potential ways of information 
collection revealed that on average more than half (57%) of all utilised 
information resources are freely available or could be accessed without 
using the collections of the home library. It should be noted that some of the 
freely available resources are not key literature for thesis writing, as 
approximately 10 per cent of the used resources were Websites and articles 
from newspapers or blogs. Thus, a conclusion can be drawn that 
approximately 40 per cent of higher quality information resources used in 
thesis writing could be potentially accessed without using the collections of 
the home library. This holds for all research fields. Another 
important aspect is that about a half (47%) of the content of subscribed 
databases is identical to that which is freely available. 

As to the validity of the results, it has to be emphasised that citation analysis 
was carried out within half a year after the theses had been defended. It is 
possible that some of the used information resources (e.g. publisher's 
versions) were not freely available during the thesis writing period owing to 
embargo periods, but were freely available when our research was carried 
out. On the other hand, we should take into account the amount of grey 
literature which is freely accessible before the point when the publisher's 
version becomes freely available after the embargo period. It is hard to 
distinguish whether doctoral students used the publisher's version of a 
paper or its pre-print or post-print; in their theses, however, reference 
was made to the publisher's version. Some researchers maintain that grey 
literature could be an important source of information, as it is common 
practice in the scientific community to self-archive papers (most often the 
revised manuscript instead of the PDF formatted by the journal, i.e. the 
publisher's version) and to upload them to accounts at ResearchGate.net or 
Academia.edu. As Sitek and Bertelmann (2014) suggest, grey literature has 
always played a role in scholarly communication. Haines, Light, O’Malley 
and Delwiche (2010) have found that science researchers rely a lot on a 
network of peers who can be treated as information sources. Another 
important aspect is that more than 70 per cent of all information resources 
utilised in theses were published two years before the theses were defended. 
Thus, the conclusion may be drawn that the validity of the given results is 
satisfactory for an exploratory pilot study. 

The research was not aimed at grasping the full situation as to how PhD 
students seek information. We tried to find the possible ways of accessing 
information used in doctoral theses and to measure how important the 
library could be as an information resource provider to PhD students. 

In future, we intend to carry out similar research covering citation 
analysis of bachelor's and master's theses. This will provide further 
insights into 
the role of the academic library as an intermediator and will help us 
understand the importance of freely available information resources for 
students. 